Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesis
Abstract
This paper describes the use of non-linear formant trajectories to model speech dynamics. The non-linear formant dynamics model is evaluated in HMM-based speech synthesis experiments, in which the 12-dimensional parallel formant synthesiser control parameters and their time derivatives serve as the HMM feature vectors. Two types of formant synthesiser control parameters, piecewise constant and smooth trajectory parameters, are used to drive the classic parallel formant synthesiser. The quality of the synthetic speech is assessed using three kinds of subjective tests. This paper shows that the non-linear formant dynamics model can improve the performance of HMM-based speech synthesis.
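As a rough illustration of the feature vectors mentioned in the abstract, the sketch below appends first-order time derivatives to a sequence of 12-dimensional static frames. The central-difference window, the edge handling, and the names `add_deltas`, `static` and `observations` are assumptions made for this example; the abstract does not state how the authors computed the derivatives.

```python
from typing import List

Frame = List[float]


def add_deltas(frames: List[Frame]) -> List[Frame]:
    """Append first-order time derivatives to each static frame.

    Uses the central difference 0.5 * (x[t+1] - x[t-1]) and repeats the
    edge frames at the boundaries. This window is an assumption for
    illustration, not the one confirmed by the paper.
    """
    if not frames:
        return []
    last = len(frames) - 1
    out: List[Frame] = []
    for t, frame in enumerate(frames):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, last)]
        delta = [0.5 * (n - p) for p, n in zip(prev_frame, next_frame)]
        out.append(frame + delta)  # static part followed by delta part
    return out


# Three hypothetical frames of 12 control parameters each (made-up values).
static = [[float(t)] * 12 for t in range(3)]
observations = add_deltas(static)
```

Each resulting observation holds 24 values: the 12 static control parameters followed by their 12 estimated derivatives.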
Similar resources
Towards an improved model of dynamics for speech recognition and synthesis
This thesis describes the research on the use of non-linear formant trajectories to model speech dynamics under the framework of a multiple-level segmental hidden Markov model (MSHMM). The particular type of intermediate-layer model investigated in this study is based on the 12-dimensional parallel formant synthesiser (PFS) control parameters, which can be directly used to synthesise speech wit...
Speech recognition using non-linear trajectories in a formant-based articulatory layer of a multiple-level segmental HMM
This paper describes how non-linear formant trajectories, based on the ‘trajectory HMM’ proposed by Tokuda et al., can be exploited under the framework of multiple-level segmental HMMs. In the resultant model, named a non-linear/linear multiple-level segmental HMM, speech dynamics are modeled as non-linear smooth trajectories in the formant-based intermediate layer. These formant trajectories are m...
Analysis, modelling and synthesis of formants of British, American and Australian accents
The formant spaces of three major English accents, namely British, American and Australian, are modelled and used for accent conversion. Accent synthesis, through modification of the acoustic parameters of speech, provides a means for assessing the perceptual contribution of each parameter to conveying an accent. An improved method based on a linear prediction (LP) model feature analysis and a 2-D...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
Formant-tracking Linear Prediction Models for Speech Processing in Noisy Environments
This paper presents a formant-tracking method for estimation of the time-varying trajectories of a linear prediction (LP) model of speech in noise. The main focus of this work is on the modelling of the non-stationary temporal trajectories of the formants of speech for improved LP model estimation in noise. The proposed approach provides a systematic framework for modelling the inter-frame corr...
Publication date: 2010